(54) SOUND ENCODING METHOD AND SOUND DECODING METHOD, AND SOUND ENCODING
DEVICE AND SOUND DECODING DEVICE



(57) A high quality speech is reproduced with a 
small data amount in speech coding and decoding for 
performing compression coding and decoding of a 
speech signal to a digital signal. 

In a speech coding method according to code-excited
linear prediction (CELP) speech coding, a noise
level of a speech in the coding period concerned is
evaluated by using a code or coding result of at least one of
spectrum information, power information, and pitch
information, and different excitation codebooks are used
based on the evaluation result.

Description

Technical Field 

[0001] This invention relates to methods for speech 
coding and decoding and apparatuses for speech cod- 
ing and decoding for performing compression coding 
and decoding of a speech signal to a digital signal. Par- 
ticularly, this invention relates to a method for speech 
coding, method for speech decoding, apparatus for 
speech coding and apparatus for speech decoding for 
reproducing a high quality speech at low bit rates. 

Background Art 

[0002] In the related art, code-excited linear prediction
(CELP) coding is well known as an efficient speech coding
method. The technique is described in M. R. Schroeder and
B. S. Atal, "Code-excited linear prediction (CELP):
High-quality speech at very low bit rates," ICASSP '85,
pp. 937-940, 1985.

[0003] Fig. 6 illustrates an example of a whole configuration
of a CELP speech coding and decoding method. In Fig. 6, an
encoder 101, decoder 102, multiplexing means 103, and dividing
means 104 are illustrated.

[0004] The encoder 101 includes a linear prediction
parameter analyzing means 105, linear prediction
parameter coding means 106, synthesis filter 107,
adaptive codebook 108, excitation codebook 109, gain
coding means 110, distance calculating means 111,
and weighting-adding means 138. The decoder 102
includes a linear prediction parameter decoding means
112, synthesis filter 113, adaptive codebook 114,
excitation codebook 115, gain decoding means 116, and
weighting-adding means 139.

[0005] In CELP speech coding, a speech in a frame 
of about 5 - 50 ms is divided into spectrum information 
and excitation information, and coded. 
[0006] The operations of the CELP speech coding method are
explained. In the encoder 101, the linear prediction
parameter analyzing means 105 analyzes an input speech S101
and extracts a linear prediction parameter, which is spectrum
information of the speech. The linear prediction parameter
coding means 106 codes the linear prediction parameter and
sets the coded linear prediction parameter as a coefficient
for the synthesis filter 107.

[0007] Explanations are made on coding of excita- 
tion information. 

[0008] An old excitation signal is stored in the adaptive
codebook 108. The adaptive codebook 108 outputs a time series
vector corresponding to an adaptive code inputted by the
distance calculating means 111. The time series vector is
generated by repeating the old excitation signal periodically.
[0009] A plurality of time series vectors, trained for
example by reducing a distortion between a speech for training
and its coded speech, is stored in the excitation codebook 109.
The excitation codebook 109 outputs a time series vector
corresponding to an excitation code inputted by the distance
calculating means 111.
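
The periodic repetition performed by the adaptive codebook can be sketched as follows. This is a minimal illustration, not the implementation of the cited art: it assumes that the adaptive code is simply a lag in samples, and the names `adaptive_codebook_vector`, `past_excitation`, `lag`, and `frame_len` are hypothetical.

```python
def adaptive_codebook_vector(past_excitation, lag, frame_len):
    """Build a frame by periodically repeating the last `lag` samples.

    Assumption: the adaptive code maps directly to an integer lag.
    """
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must be between 1 and len(past_excitation)")
    period = past_excitation[-lag:]  # most recent `lag` samples of old excitation
    return [period[n % lag] for n in range(frame_len)]


old = [1, 2, 3, 4, 5]
vec = adaptive_codebook_vector(old, lag=2, frame_len=5)
print(vec)
```

With a lag shorter than the frame, the last cycle of the old excitation is repeated until the frame is filled.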

[0010] Each of the time series vectors outputted from the
adaptive codebook 108 and excitation codebook 109 is weighted
by a respective gain provided by the gain coding means 110,
and the weighted vectors are added by the weighting-adding
means 138. The addition result is provided to the synthesis
filter 107 as an excitation signal, and a coded speech is
produced. The distance calculating means 111 calculates a
distance between the coded speech and the input speech S101,
and searches for the adaptive code, excitation code, and gains
that minimize the distance. When this coding is over, a
linear prediction parameter code and the adaptive code,
excitation code, and gain codes that minimize the distortion
between the input speech and the coded speech are outputted
as a coding result.
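
The search loop described above (weight, add, synthesize, measure distance, keep the best codes) can be sketched as below. Several points are assumptions made only for illustration: the distance is taken as a squared error, the synthesis filter is an all-pole filter of the form s[n] = e[n] - sum_k a_k s[n-k], the adaptive vector is held fixed, and the gains are searched as a small table of pairs. The function names are hypothetical.

```python
def synthesize(excitation, lpc):
    """All-pole filter, assumed form: s[n] = e[n] - sum_k lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def search(target, adaptive_vec, codebook, gain_pairs, lpc):
    """Exhaustively find (distance, excitation_code, gain_code) with least error."""
    best = None
    for code, vec in enumerate(codebook):
        for gain_code, (g_adapt, g_exc) in enumerate(gain_pairs):
            # Weight each time series vector by its gain and add them.
            excitation = [g_adapt * a + g_exc * v
                          for a, v in zip(adaptive_vec, vec)]
            coded = synthesize(excitation, lpc)
            # Assumed distance measure: squared error to the input speech.
            dist = sum((t - c) ** 2 for t, c in zip(target, coded))
            if best is None or dist < best[0]:
                best = (dist, code, gain_code)
    return best


lpc = [-0.5]
target = synthesize([0.0, 2.0, 0.0, 0.0], lpc)
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
gain_pairs = [(0.0, 1.0), (0.0, 2.0)]
best = search(target, [0.0] * 4, codebook, gain_pairs, lpc)
print(best)
```

A practical coder would also search the adaptive code and would normally apply perceptual weighting; both are omitted here to keep the sketch short.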

[0011] The operations of the CELP speech decoding method are
explained.

[0012] In the decoder 102, the linear prediction parameter
decoding means 112 decodes the linear prediction parameter
code to the linear prediction parameter, and sets the linear
prediction parameter as a coefficient for the synthesis filter
113. The adaptive codebook 114 outputs a time series vector
corresponding to an adaptive code, which is generated by
repeating an old excitation signal periodically. The
excitation codebook 115 outputs a time series vector
corresponding to an excitation code. The time series vectors
are weighted by respective gains, which are decoded from the
gain codes by the gain decoding means 116, and added by the
weighting-adding means 139. The addition result is provided
to the synthesis filter 113 as an excitation signal, and an
output speech S103 is produced.

[0013] Among CELP speech coding and decoding methods, an
improved method for reproducing a high quality speech
according to the related art is described in S. Wang and
A. Gersho, "Phonetically-based vector excitation coding of
speech at 3.6 kbps," ICASSP '89, pp. 49-52, 1989.
[0014] Fig. 7 shows an example of a whole configuration of
the speech coding and decoding method according to the related
art. The same signs are used for means corresponding to the
means in Fig. 6.
[0015] In Fig. 7, the encoder 101 includes a speech state
deciding means 117, excitation codebook switching means 118,
first excitation codebook 119, and second excitation codebook
120. The decoder 102 includes an excitation codebook switching
means 121, first excitation codebook 122, and second
excitation codebook 123.

[0016] The operations of the coding and decoding method in
this configuration are explained. In the encoder 101, the
speech state deciding means 117 analyzes the input speech
S101 and decides which of two states, e.g., voiced or
unvoiced, the speech is in. The excitation codebook switching
means 118 switches the excitation codebooks to be used in
coding based on the speech state deciding result. For
example, if the speech is voiced, the first excitation
codebook 119 is used, and if the speech is unvoiced, the
second excitation codebook 120 is used. Then, the excitation
codebook switching means 118 codes which excitation codebook
is used in coding.
[0017] In the decoder 102, the excitation codebook switching
means 121 switches between the first excitation codebook 122
and the second excitation codebook 123 based on a code showing
which excitation codebook was used in the encoder 101, so that
the excitation codebook used in the encoder 101 is also used
in the decoder 102. According to this configuration,
excitation codebooks suitable for coding in various speech
states are provided, and the excitation codebooks are switched
based on the state of the input speech. Hence, a high quality
speech can be reproduced.
[0018] A speech coding and decoding method of switching a
plurality of excitation codebooks without increasing the
transmission bit number according to the related art is
disclosed in Japanese Unexamined Published Patent Application
8-185198. The plurality of excitation codebooks is switched
based on a pitch frequency selected in an adaptive codebook,
and an excitation codebook suitable for characteristics of an
input speech can be used without increasing transmission data.
[0019] As stated, in the speech coding and decoding method
illustrated in Fig. 6 according to the related art, a single
excitation codebook is used to produce a synthetic speech.
Non-noise time series vectors with many pulses should be
stored in the excitation codebook to produce a high quality
coded speech even at low bit rates. Therefore, when a noise
speech, e.g., background noise, fricative consonant, etc., is
coded and synthesized, there is a problem that the coded
speech produces an unnatural sound, e.g., "Jiri-Jiri" and
"Chin-Chin." This problem can be solved if the excitation
codebook includes only noise time series vectors. However, in
that case, the quality of the coded speech degrades as a
whole.

[0020] In the improved speech coding and decoding method
illustrated in Fig. 7 according to the related art, the
plurality of excitation codebooks is switched based on the
state of the input speech for producing a coded speech.
Therefore, it is possible to use, for example, an excitation
codebook including noise time series vectors in an unvoiced
noise period of the input speech and an excitation codebook
including non-noise time series vectors in a voiced period
other than the unvoiced noise period. Hence, even if a noise
speech is coded and synthesized, an unnatural sound, e.g.,
"Jiri-Jiri," is not produced. However, since the excitation
codebook used in coding must also be used in decoding, it
becomes necessary to code and transmit data showing which
excitation codebook was used. This becomes an obstacle to
lowering bit rates.

[0021] According to the speech coding and decod- 
ing method of switching the plurality of excitation code- 
books without increasing a transmission bit number 
according to the related art, the excitation codebooks 
are switched based on a pitch period selected in the 
adaptive codebook. However, the pitch period selected 
in the adaptive codebook differs from an actual pitch 
period of a speech, and it is impossible to decide if a 
state of an input speech is noise or non-noise only from
a value of the pitch period. Therefore, the problem that 
the coded speech in the noise period of the speech is 
unnatural cannot be solved. 

[0022] This invention was intended to solve the 
above-stated problems. Particularly, this invention aims 
at providing speech coding and decoding methods and 
apparatuses for reproducing a high quality speech even 
at low bit rates. 

Disclosure of the Invention 

[0023] In order to solve the above-stated problems,
in a speech coding method according to this invention,
a noise level of a speech in the coding period concerned
is evaluated by using a code or coding result of at least
one of spectrum information, power information, and
pitch information, and one of a plurality of excitation
codebooks is selected based on an evaluation result.
[0024] In a speech coding method according to 
another invention, a plurality of excitation codebooks 
storing time series vectors with various noise levels is 
provided, and the plurality of excitation codebooks is 
switched based on an evaluation result of a noise level 
of a speech. 

[0025] In a speech coding method according to 
another invention, a noise level of time series vectors 
stored in an excitation codebook is changed based on 
an evaluation result of a noise level of a speech. 
[0026] In a speech coding method according to
another invention, an excitation codebook storing noise
time series vectors is provided. A low noise time series
vector is generated by sampling signal samples in the
time series vectors based on the evaluation result of a
noise level of a speech.

[0027] In a speech coding method according to
another invention, a first excitation codebook storing a
noise time series vector and a second excitation codebook
storing a non-noise time series vector are provided.
A time series vector is generated by adding the
time series vector in the first excitation codebook and
the time series vector in the second excitation codebook
by weighting based on an evaluation result of a noise
level of a speech.
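
The weighted addition of a noise vector and a non-noise vector can be sketched as follows. The text does not state how the weight is derived from the noise level, so this sketch assumes a noise level normalized to the range 0 to 1 and a linear weight; `mixed_excitation` and its parameters are hypothetical names.

```python
def mixed_excitation(noise_vec, non_noise_vec, noise_level):
    """Blend two time series vectors.

    Assumption: noise_level lies in [0, 1], where 1 means fully noise-like,
    and the weight is a linear function of it.
    """
    w = min(1.0, max(0.0, noise_level))  # clamp to the assumed range
    return [w * n + (1.0 - w) * p for n, p in zip(noise_vec, non_noise_vec)]


print(mixed_excitation([1.0, -1.0], [0.0, 2.0], 0.25))
```

Because the weight varies continuously, intermediate noise levels give intermediate vectors instead of a hard choice between two codebooks.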

[0028] In a speech decoding method according to another
invention, a noise level of a speech in the decoding period
concerned is evaluated by using a code or coding result of at
least one of spectrum information, power information, and
pitch information, and one of the plurality of excitation
codebooks is selected based on an evaluation result.

[0029] In a speech decoding method according to
another invention, a plurality of excitation codebooks
storing time series vectors with various noise levels is
provided, and the plurality of excitation codebooks is
switched based on an evaluation result of the noise level
of the speech.

[0030] In a speech decoding method according to
another invention, noise levels of time series vectors
stored in excitation codebooks are changed based on
an evaluation result of the noise level of the speech.
[0031] In a speech decoding method according to
another invention, an excitation codebook storing noise
time series vectors is provided. A low noise time series
vector is generated by sampling signal samples in the
time series vectors based on the evaluation result of the
noise level of the speech.

[0032] In a speech decoding method according to
another invention, a first excitation codebook storing a
noise time series vector and a second excitation codebook
storing a non-noise time series vector are provided.
A time series vector is generated by adding the
time series vector in the first excitation codebook and
the time series vector in the second excitation codebook
by weighting based on an evaluation result of a noise
level of a speech.

[0033] A speech coding apparatus according to another
invention includes a spectrum information encoder for coding
spectrum information of an input speech and outputting coded
spectrum information as an element of a coding result, a noise
level evaluator for evaluating a noise level of a speech in
the coding period concerned by using a code or coding result
of at least one of the spectrum information and power
information, which is obtained from the coded spectrum
information provided by the spectrum information encoder, and
outputting an evaluation result, a first excitation codebook
storing a plurality of non-noise time series vectors, a second
excitation codebook storing a plurality of noise time series
vectors, an excitation codebook switch for switching between
the first excitation codebook and the second excitation
codebook based on the evaluation result by the noise level
evaluator, a weighting-adder for weighting the time series
vectors from the first excitation codebook and second
excitation codebook depending on respective gains of the time
series vectors and adding them, a synthesis filter for
producing a coded speech based on an excitation signal, which
is the weighted time series vectors, and the coded spectrum
information provided by the spectrum information encoder, and
a distance calculator for calculating a distance between the
coded speech and the input speech, searching for an excitation
code and gain that minimize the distance, and outputting the
result as an excitation code and a gain code as a coding
result.

[0034] A speech decoding apparatus according to another
invention includes a spectrum information decoder for decoding
a spectrum information code to spectrum information, a noise
level evaluator for evaluating a noise level of a speech in
the decoding period concerned by using a decoding result of at
least one of the spectrum information and power information,
which is obtained from decoded spectrum information provided
by the spectrum information decoder, and the spectrum
information code, and outputting an evaluation result, a
first excitation codebook storing a plurality of non-noise
time series vectors, a second excitation codebook storing a
plurality of noise time series vectors, an excitation
codebook switch for switching between the first excitation
codebook and the second excitation codebook based on the
evaluation result by the noise level evaluator, a
weighting-adder for weighting the time series vectors from the
first excitation codebook and the second excitation codebook
depending on respective gains of the time series vectors and
adding them, and a synthesis filter for producing a decoded
speech based on an excitation signal, which is a weighted time
series vector, and the decoded spectrum information from the
spectrum information decoder.

[0035] A speech coding apparatus according to this
invention includes a noise level evaluator for evaluating
a noise level of a speech in the coding period concerned
by using a code or coding result of at least one of
spectrum information, power information, and pitch
information, and an excitation codebook switch for switching a
plurality of excitation codebooks based on an evaluation
result of the noise level evaluator, in a code-excited
linear prediction (CELP) speech coding apparatus.
[0036] A speech decoding apparatus according to
this invention includes a noise level evaluator for
evaluating a noise level of a speech in the decoding period
concerned by using a code or decoding result of at least one
of spectrum information, power information, and pitch
information, and an excitation codebook switch for
switching a plurality of excitation codebooks based on
an evaluation result of the noise level evaluator, in a
code-excited linear prediction (CELP) speech decoding
apparatus.

Brief Description of the Drawings 
[0037] 

Fig. 1 shows a block diagram of a whole configuration of a
speech coding and speech decoding apparatus in embodiment 1
of this invention.

Fig. 2 shows a table for explaining an evaluation of a noise
level in embodiment 1 of this invention illustrated in Fig. 1.

Fig. 3 shows a block diagram of a whole configuration of a
speech coding and speech decoding apparatus in embodiment 3
of this invention.

Fig. 4 shows a block diagram of a whole configuration of a
speech coding and speech decoding apparatus in embodiment 5
of this invention.

Fig. 5 shows a schematic line chart for explaining a decision
process of weighting in embodiment 5 illustrated in Fig. 4.

Fig. 6 shows a block diagram of a whole configuration of a
CELP speech coding and decoding apparatus according to the
related art.

Fig. 7 shows a block diagram of a whole configuration of an
improved CELP speech coding and decoding apparatus according
to the related art.

Best Mode for Carrying Out the Invention 

[0038] Embodiments of this invention are explained with
reference to the drawings.

Embodiment 1.

[0039] Fig. 1 illustrates a whole configuration of a speech
coding method and speech decoding method in embodiment 1
according to this invention. In Fig. 1, an encoder 1, a
decoder 2, a multiplexer 3, and a divider 4 are illustrated.
The encoder 1 includes a linear prediction parameter analyzer
5, linear prediction parameter encoder 6, synthesis filter 7,
adaptive codebook 8, gain encoder 10, distance calculator 11,
first excitation codebook 19, second excitation codebook 20,
noise level evaluator 24, excitation codebook switch 25, and
weighting-adder 38. The decoder 2 includes a linear prediction
parameter decoder 12, synthesis filter 13, adaptive codebook
14, first excitation codebook 22, second excitation codebook
23, noise level evaluator 26, excitation codebook switch 27,
gain decoder 16, and weighting-adder 39. In Fig. 1, the linear
prediction parameter analyzer 5 is a spectrum information
analyzer for analyzing an input speech S1 and extracting a
linear prediction parameter, which is spectrum information of
the speech. The linear prediction parameter encoder 6 is a
spectrum information encoder for coding the linear prediction
parameter, which is the spectrum information, and setting a
coded linear prediction parameter as a coefficient for the
synthesis filter 7. The first excitation codebooks 19 and 22
store pluralities of non-noise time series vectors, and the
second excitation codebooks 20 and 23 store pluralities of
noise time series vectors. The noise level evaluators 24 and
26 evaluate a noise level, and the excitation codebook
switches 25 and 27 switch the excitation codebooks based on
the noise level.

[0040] Operations are explained.
[0041] In the encoder 1, the linear prediction parameter
analyzer 5 analyzes the input speech S1, and extracts a linear
prediction parameter, which is spectrum information of the
speech. The linear prediction parameter encoder 6 codes the
linear prediction parameter. Then, the linear prediction
parameter encoder 6 sets a coded linear prediction parameter
as a coefficient for the synthesis filter 7, and also outputs
the coded linear prediction parameter to the noise level
evaluator 24.

[0042] Explanations are made on coding of excita- 
tion information. 

[0043] An old excitation signal is stored in the adaptive
codebook 8, and a time series vector corresponding to an
adaptive code inputted by the distance calculator 11, which is
generated by repeating the old excitation signal periodically,
is outputted. The noise level evaluator 24 evaluates a noise
level in the coding period concerned based on the coded linear
prediction parameter inputted by the linear prediction
parameter encoder 6 and the adaptive code, using, e.g., a
spectrum gradient, short-term prediction gain, and pitch
fluctuation as shown in Fig. 2, and outputs an evaluation
result to the excitation codebook switch 25. The excitation
codebook switch 25 switches excitation codebooks for coding
based on the evaluation result of the noise level. For
example, if the noise level is low, the first excitation
codebook 19 is used, and if the noise level is high, the
second excitation codebook 20 is used.
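
The evaluation and switching step can be sketched as below. Fig. 2 is not reproduced in this text, so the decision rule is purely an assumption: each of the three indicators casts a vote, and the thresholds, the direction of each comparison, and the names `evaluate_noise_level` and `select_codebook` are hypothetical.

```python
def evaluate_noise_level(spectrum_gradient, prediction_gain, pitch_fluctuation,
                         grad_thr=0.0, gain_thr=4.0, fluct_thr=0.2):
    """Return "high" or "low" by majority vote over three indicators.

    All thresholds and comparison directions are illustrative assumptions,
    not values taken from Fig. 2.
    """
    votes = sum([
        spectrum_gradient > grad_thr,   # assumed: flat or rising spectrum
        prediction_gain < gain_thr,     # assumed: weak short-term prediction
        pitch_fluctuation > fluct_thr,  # assumed: unstable pitch
    ])
    return "high" if votes >= 2 else "low"


def select_codebook(noise_level, non_noise_codebook, noise_codebook):
    """Low noise level: first (non-noise) codebook; high: second (noise)."""
    return noise_codebook if noise_level == "high" else non_noise_codebook


level = evaluate_noise_level(0.5, 1.5, 0.6)
print(level, select_codebook(level, "codebook 19", "codebook 20"))
```

The inputs are quantities that can be computed from the coded linear prediction parameter and the adaptive code, so no extra analysis of the input speech is needed for the decision.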

[0044] The first excitation codebook 19 stores a plurality of
non-noise time series vectors, e.g., a plurality of time
series vectors trained by reducing a distortion between a
speech for training and its coded speech. The second
excitation codebook 20 stores a plurality of noise time series
vectors, e.g., a plurality of time series vectors generated
from random noises. Each of the first excitation codebook 19
and the second excitation codebook 20 outputs a time series
vector corresponding to an excitation code inputted by the
distance calculator 11. Each of the time series vectors from
the adaptive codebook 8 and from one of the first excitation
codebook 19 or second excitation codebook 20 is weighted by a
respective gain provided by the gain encoder 10, and added by
the weighting-adder 38. The addition result is provided to the
synthesis filter 7 as an excitation signal, and a coded speech
is produced. The distance calculator 11 calculates a distance
between the coded speech and the input speech S1, and searches
for an adaptive code, excitation code, and gain that minimize
the distance. When this coding is over, the linear prediction
parameter code and the adaptive code, excitation code, and
gain code that minimize the distortion between the input
speech and the coded speech are outputted as a coding result
S2. These are characteristic operations in the speech coding
method in embodiment 1.

[0045] The decoder 2 is explained. In the decoder 2, the
linear prediction parameter decoder 12 decodes the linear
prediction parameter code to the linear prediction parameter,
sets the decoded linear prediction parameter as a coefficient
for the synthesis filter 13, and outputs the decoded linear
prediction parameter to the noise level evaluator 26.
[0046] Decoding of excitation information is explained. The
adaptive codebook 14 outputs a time series vector
corresponding to an adaptive code, which is generated by
repeating an old excitation signal periodically. The noise
level evaluator 26 evaluates a noise level by using the
decoded linear prediction parameter inputted by the linear
prediction parameter decoder 12 and the adaptive code, in the
same method as the noise level evaluator 24 in the encoder 1,
and outputs an evaluation result to the excitation codebook
switch 27. The excitation codebook switch 27 switches between
the first excitation codebook 22 and the second excitation
codebook 23 based on the evaluation result of the noise level,
in the same method as the excitation codebook switch 25 in the
encoder 1.
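
Because the decision depends only on values that are transmitted anyway, the decoder can repeat it without a separate mode code. The sketch below illustrates this under invented assumptions: a lookup table from a quantized linear prediction code to a short-term prediction gain, pitch fluctuation taken as the relative change between successive lags, and arbitrary thresholds. `PREDICTION_GAIN` and `choose_codebook` are hypothetical.

```python
# Hypothetical table: quantized linear prediction code -> prediction gain.
PREDICTION_GAIN = {0: 1.2, 1: 3.0, 2: 9.5, 3: 15.0}


def choose_codebook(lpc_code, prev_lag, lag):
    """Return 0 for the non-noise codebook, 1 for the noise codebook.

    The rule uses only transmitted codes, so the encoder and the decoder
    can evaluate it independently and obtain the same answer.
    """
    gain = PREDICTION_GAIN[lpc_code]
    fluctuation = abs(lag - prev_lag) / prev_lag
    noisy = gain < 4.0 and fluctuation > 0.2  # assumed thresholds
    return 1 if noisy else 0


# (lpc_code, previous lag, current lag) for two frames
frames = [(0, 40, 75), (3, 60, 61)]
encoder_choice = [choose_codebook(*f) for f in frames]
decoder_choice = [choose_codebook(*f) for f in frames]
print(encoder_choice, decoder_choice)
```

Both sides run the same deterministic rule on the same codes, which is why no bits are spent on signalling the selected codebook.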

[0047] A plurality of non-noise time series vectors, e.g., a
plurality of time series vectors generated by training for
reducing a distortion between a speech for training and its
coded speech, is stored in the first excitation codebook 22.
A plurality of noise time series vectors, e.g., a plurality of
vectors generated from random noises, is stored in the second
excitation codebook 23. Each of the first and second
excitation codebooks outputs a time series vector
corresponding to an excitation code. The time series vectors
from the adaptive codebook 14 and from one of the first
excitation codebook 22 or second excitation codebook 23 are
weighted by respective gains, decoded from gain codes by the
gain decoder 16, and added by the weighting-adder 39. The
addition result is provided to the synthesis filter 13 as an
excitation signal, and an output speech S3 is produced. These
are characteristic operations in the speech decoding method in
embodiment 1.
[0048] In embodiment 1, the noise level of the input speech
is evaluated by using the code and coding result, and
different excitation codebooks are used based on the
evaluation result. Therefore, a high quality speech can be
reproduced with a small data amount.
[0049] In embodiment 1, a plurality of time series vectors is
stored in each of the excitation codebooks 19, 20, 22, and 23.
However, this embodiment can be realized as long as at least
one time series vector is stored in each of the excitation
codebooks.

Embodiment 2. 

[0050] In embodiment 1, two excitation codebooks are
switched. However, it is also possible to provide three or
more excitation codebooks and switch among them based on a
noise level.
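
Switching among three or more codebooks can be sketched as a simple quantization of the noise level. The normalized range of 0 to 1 and the boundary values are assumptions for illustration, and `codebook_index` is a hypothetical name.

```python
import bisect


def codebook_index(noise_level, boundaries=(0.33, 0.66)):
    """Map a noise level to one of len(boundaries) + 1 codebooks.

    Assumption: noise_level is normalized to [0, 1]; index 0 is the least
    noise-like codebook and the highest index is the most noise-like.
    """
    return bisect.bisect_right(boundaries, noise_level)


print([codebook_index(x) for x in (0.1, 0.5, 0.9)])
```

Adding a boundary adds a codebook, so the same rule extends to any number of intermediate noise levels.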

[0051] In embodiment 2, a suitable excitation codebook can
be used even for an intermediate speech, e.g., slightly
noisy, in addition to the two kinds of speech, i.e., noise
and non-noise. Therefore, a high quality speech can be
reproduced.

Embodiment 3.

[0052] Fig. 3 shows a whole configuration of a speech coding
method and speech decoding method in embodiment 3 of this
invention. In Fig. 3, the same signs are used for units
corresponding to the units in Fig. 1. In Fig. 3, excitation
codebooks 28 and 30 store noise time series vectors, and
samplers 29 and 31 set an amplitude value of a sample with a
low amplitude in the time series vectors to zero.

[0053] Operations are explained. In the encoder 1, the linear
prediction parameter analyzer 5 analyzes the input speech S1,
and extracts a linear prediction parameter, which is spectrum
information of the speech. The linear prediction parameter
encoder 6 codes the linear prediction parameter. Then, the
linear prediction parameter encoder 6 sets a coded linear
prediction parameter as a coefficient for the synthesis filter
7, and also outputs the coded linear prediction parameter to
the noise level evaluator 24.

[0054] Coding of excitation information is explained. An old
excitation signal is stored in the adaptive codebook 8, and a
time series vector corresponding to an adaptive code inputted
by the distance calculator 11, which is generated by repeating
the old excitation signal periodically, is outputted. The
noise level evaluator 24 evaluates a noise level in the coding
period concerned by using the coded linear prediction
parameter, which is inputted from the linear prediction
parameter encoder 6, and the adaptive code, using, e.g., a
spectrum gradient, short-term prediction gain, and pitch
fluctuation, and outputs an evaluation result to the sampler
29.

[0055] The excitation codebook 28 stores a plurality of time
series vectors generated from random noises, for example, and
outputs a time series vector corresponding to an excitation
code inputted by the distance calculator 11. If the evaluation
result indicates that the noise level is low, the sampler 29
outputs a time series vector in which, for example, the
amplitude of each sample with an amplitude below a determined
value in the time series vector inputted from the excitation
codebook 28 is set to zero. If the noise level is high, the
sampler 29 outputs the time series vector inputted from the
excitation codebook 28 without modification. Each of the time
series vectors from the adaptive codebook 8 and the sampler 29
is weighted by a respective gain provided by the gain encoder
10 and added by the weighting-adder 38. The addition result is
provided to the synthesis filter 7 as an excitation signal,
and a coded speech is produced. The distance calculator 11
calculates a distance between the coded speech and the input
speech S1, and searches for an adaptive code, excitation code,
and gain that minimize the distance. When coding is over, the
linear prediction parameter code and the adaptive code,
excitation code, and gain code that minimize the distortion
between the input speech and the coded speech are outputted as
a coding result S2. These are characteristic operations in the
speech coding method in embodiment 3.
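
The behavior of the sampler can be sketched as follows. The threshold value and the boolean form of the noise level are assumptions for illustration, and `sample_vector` is a hypothetical name.

```python
def sample_vector(vec, noise_is_high, threshold):
    """Zero out low-amplitude samples when the noise level is low.

    When the noise level is high, the vector passes through unchanged.
    """
    if noise_is_high:
        return list(vec)
    return [x if abs(x) >= threshold else 0.0 for x in vec]


noise_vec = [0.1, -0.9, 0.4, -0.2]
print(sample_vector(noise_vec, noise_is_high=False, threshold=0.3))
print(sample_vector(noise_vec, noise_is_high=True, threshold=0.3))
```

Zeroing the small samples leaves a sparse, pulse-like vector, so a single noise codebook can serve both noise-like and non-noise frames.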

[0056] The decoder 2 is explained. In the decoder 2, the
linear prediction parameter decoder 12 decodes the linear
prediction parameter code to the linear prediction parameter.
The linear prediction parameter decoder 12 sets the linear
prediction parameter as a coefficient for the synthesis filter
13, and also outputs the linear prediction parameter to the
noise level evaluator 26.
[0057] Decoding of excitation information is explained. The
adaptive codebook 14 outputs a time series vector
corresponding to an adaptive code, generated by repeating an
old excitation signal periodically. The noise level evaluator
26 evaluates a noise level by using the decoded linear
prediction parameter inputted from the linear prediction
parameter decoder 12 and the adaptive code, in the same method
as the noise level evaluator 24 in the encoder 1, and outputs
an evaluation result to the sampler 31.
[0058] The excitation codebook 30 outputs a time series
vector corresponding to an excitation code. The sampler 31
outputs a time series vector based on the evaluation result of
the noise level, by the same processing as the sampler 29 in
the encoder 1. Each of the time series vectors outputted from
the adaptive codebook 14 and the sampler 31 is weighted by a
respective gain provided by the gain decoder 16, and added by
the weighting-adder 39. The addition result is provided to the
synthesis filter 13 as an excitation signal, and an output
speech S3 is produced.

[0059] In embodiment 3, the excitation codebook storing noise
time series vectors is provided, and an excitation with a low
noise level can be generated by sampling excitation signal
samples based on an evaluation result of the noise level of
the speech. Hence, a high quality speech can be reproduced
with a small data amount. Further, since it is not necessary
to provide a plurality of excitation codebooks, the memory
amount for storing the excitation codebook can be reduced.

Embodiment 4. 

[0060] In embodiment 3, the samples in the time series
vectors are either sampled or not. However, it is also
possible to change a threshold value of an amplitude for
sampling the samples based on the noise level. In embodiment
4, a suitable time series vector can be generated and used
also for a medium speech, e.g., slightly noisy, in addition to
the two types of speech, i.e., noise and non-noise. Therefore,
a high quality speech can be reproduced.
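
As a non-limiting illustration, sampling with a noise-dependent amplitude threshold may be sketched as follows. The specification does not state how the threshold depends on the noise level; the linear mapping below, the normalization of the noise level to the range 0 to 1, and the function names are all assumptions made for this sketch.

```python
from typing import List, Sequence


def amplitude_threshold(noise_level: float, peak: float) -> float:
    """Hypothetical mapping: a fully noisy period (1.0) keeps every sample,
    a fully non-noisy period (0.0) keeps only samples at the peak amplitude."""
    level = min(max(noise_level, 0.0), 1.0)
    return (1.0 - level) * peak


def sample_vector(vector: Sequence[float], noise_level: float) -> List[float]:
    """Keep samples whose amplitude reaches the threshold; zero the others."""
    peak = max((abs(x) for x in vector), default=0.0)
    threshold = amplitude_threshold(noise_level, peak)
    return [x if abs(x) >= threshold else 0.0 for x in vector]


# Usage: a medium noise level keeps only the larger-amplitude samples.
medium = sample_vector([0.2, -1.0, 0.6, -0.4], 0.5)
```

With this mapping, intermediate noise levels yield vectors between the fully sampled and the unsampled case, which corresponds to the medium speech mentioned above.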

Embodiment 5.

[0061] Fig. 4 shows a whole configuration of a speech coding
method and a speech decoding method in embodiment 5 of this
invention, and the same signs are used for units corresponding
to the units in Fig. 1.
[0062] In Fig. 4, first excitation codebooks 32 and 
35 store noise time series vectors, and second excita- 
tion codebooks 33 and 36 store non-noise time series 
vectors. The weight determiners 34 and 37 are also 
illustrated. 

[0063] Operations are explained. In the encoder 1, the
linear prediction parameter analyzer 5 analyzes the input
speech S1, and extracts a linear prediction parameter, which
is spectrum information of the speech. The linear prediction
parameter encoder 6 codes the linear prediction parameter.
Then, the linear prediction parameter encoder 6 sets a coded
linear prediction parameter as a coefficient for the synthesis
filter 7, and also outputs the coded linear prediction
parameter to the noise level evaluator 24.

[0064] Coding of the excitation information is explained.
The adaptive codebook 8 stores an old excitation signal, and
outputs a time series vector corresponding to an adaptive code
inputted by the distance calculator 11, which is generated by
periodically repeating the old excitation signal. The noise
level evaluator 24 evaluates a noise level in a concerning
coding period by using the coded linear prediction parameter
inputted from the linear prediction parameter encoder 6 and
the adaptive code, e.g., from a spectrum gradient, short-term
prediction gain, and pitch fluctuation, and outputs an
evaluation result to the weight determiner 34.
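
As a non-limiting illustration, two of the quantities named above may be computed as sketched below. The specification does not give formulas or thresholds; the use of reflection coefficients as the linear prediction parameter, the definition of pitch fluctuation as a mean relative lag change, the decision rule, both threshold values, and all function names are assumptions made for this sketch.

```python
import math
from typing import Sequence


def short_term_prediction_gain_db(reflection_coeffs: Sequence[float]) -> float:
    """Prediction gain in dB, using the standard relation
    residual energy / input energy = prod(1 - k_i^2)."""
    ratio = 1.0
    for k in reflection_coeffs:
        if abs(k) >= 1.0:
            raise ValueError("reflection coefficients must satisfy |k| < 1")
        ratio *= 1.0 - k * k
    return -10.0 * math.log10(ratio)


def pitch_fluctuation(pitch_lags: Sequence[int]) -> float:
    """Mean relative change between consecutive pitch lags (adaptive codes)."""
    if len(pitch_lags) < 2:
        return 0.0
    changes = [abs(b - a) / a for a, b in zip(pitch_lags, pitch_lags[1:])]
    return sum(changes) / len(changes)


def is_noisy(reflection_coeffs: Sequence[float], pitch_lags: Sequence[int],
             gain_threshold_db: float = 6.0,
             fluctuation_threshold: float = 0.2) -> bool:
    """Hypothetical rule: low prediction gain or unstable pitch means noisy."""
    return (short_term_prediction_gain_db(reflection_coeffs) < gain_threshold_db
            or pitch_fluctuation(pitch_lags) > fluctuation_threshold)
```

Because both inputs are available from already coded information, a decoder can repeat the same evaluation on the decoded parameters without receiving any additional data.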

[0065] The first excitation codebook 32 stores a plurality
of noise time series vectors generated from random noises, for
example, and outputs a time series vector corresponding to an
excitation code. The second excitation codebook 33 stores a
plurality of time series vectors generated by training for
reducing a distortion between a speech for training and its
coded speech, and outputs a time series vector corresponding
to an excitation code inputted by the distance calculator 11.
The weight determiner 34 determines a weight provided to the
time series vector from the first excitation codebook 32 and
the time series vector from the second excitation codebook 33
based on the evaluation result of the noise level inputted
from the noise level evaluator 24, as illustrated in Fig. 5,
for example. Each of the time series vectors from the first
excitation codebook 32 and the second excitation codebook 33
is weighted by using the weight provided by the weight
determiner 34, and the weighted vectors are added. The time
series vector outputted from the adaptive codebook 8 and the
time series vector generated by this weighting and addition
are weighted by using respective gains provided by the gain
encoder 10, and added by the weighting-adder 38. Then, the
addition result is provided to the synthesis filter 7 as an
excitation signal, and a coded speech is produced. The
distance calculator 11 calculates a distance between the coded
speech and the input speech S1, and searches for an adaptive
code, excitation code, and gain for minimizing the distance.
When coding is over, the linear prediction parameter code,
adaptive code, excitation code, and gain code for minimizing a
distortion between the input speech and the coded speech are
outputted as a coding result.
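
As a non-limiting illustration, the weight determination and the weighted addition of the two codebook vectors may be sketched as follows. The shape of the weight curve of Fig. 5 is not reproduced in this text, so the linear cross-fade, the normalization of the noise level to the range 0 to 1, and the function names are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple


def determine_weights(noise_level: float) -> Tuple[float, float]:
    """Return (weight for the noise vector, weight for the non-noise vector).

    Hypothetical linear cross-fade over a noise level clamped to [0, 1].
    """
    level = min(max(noise_level, 0.0), 1.0)
    return level, 1.0 - level


def mix_codebook_vectors(noise_vector: Sequence[float],
                         non_noise_vector: Sequence[float],
                         noise_level: float) -> List[float]:
    """Weight the two time series vectors by the noise level and add them."""
    if len(noise_vector) != len(non_noise_vector):
        raise ValueError("time series vectors must have the same length")
    w_noise, w_non_noise = determine_weights(noise_level)
    return [w_noise * n + w_non_noise * m
            for n, m in zip(noise_vector, non_noise_vector)]


# Usage: a mostly non-noisy period leans on the trained codebook vector.
mixed = mix_codebook_vectors([1.0, -1.0], [0.0, 2.0], 0.25)
```

Since the weights depend only on the noise level evaluation, the decoder can derive the same weights from the decoded parameters.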

[0066] The decoder 2 is explained next. In the decoder 2,
the linear prediction parameter decoder 12 decodes the linear
prediction parameter code to the linear prediction parameter.
Then, the linear prediction parameter decoder 12 sets the
linear prediction parameter as a coefficient for the synthesis
filter 13, and also outputs the linear prediction parameter to
the noise level evaluator 26.

[0067] Decoding of the excitation information is explained.
The adaptive codebook 14 outputs a time series vector
corresponding to an adaptive code by periodically repeating an
old excitation signal. The noise level evaluator 26 evaluates
a noise level by using the decoded linear prediction
parameter, which is inputted from the linear prediction
parameter decoder 12, and the adaptive code, in the same
manner as the noise level evaluator 24 in the encoder 1, and
outputs an evaluation result to the weight determiner 37.

[0068] The first excitation codebook 35 and the second
excitation codebook 36 output time series vectors
corresponding to excitation codes. The weight determiner 37
determines weights based on the noise level evaluation result
inputted from the noise level evaluator 26, in the same manner
as the weight determiner 34 in the encoder 1. Each of the time
series vectors from the first excitation codebook 35 and the
second excitation codebook 36 is weighted by using a
respective weight provided by the weight determiner 37, and
the weighted vectors are added. The time series vector
outputted from the adaptive codebook 14 and the time series
vector generated by this weighting and addition are weighted
by using respective gains decoded from the gain codes by the
gain decoder 16, and added by the weighting-adder 39. Then,
the addition result is provided to the synthesis filter 13 as
an excitation signal, and an output speech S3 is produced.

[0069] In embodiment 5, the noise level of the speech is
evaluated by using a code and coding result, and the noise
time series vector or non-noise time series vector are
weighted based on the evaluation result, and added. Therefore,
a high quality speech can be reproduced with a small data
amount.

Embodiment 6. 

[0070] In embodiments 1 - 5, it is also possible to 
change gain codebooks based on the evaluation result 
of the noise level. In embodiment 6, a most suitable gain 
codebook can be used based on the excitation code- 
book. Therefore, a high quality speech can be repro- 
duced. 

Embodiment 7. 

[0071] In embodiments 1 - 6, the noise level of the speech
is evaluated, and the excitation codebooks are switched based
on the evaluation result. However, it is also possible to
decide and evaluate each of a voiced onset, plosive consonant,
etc., and switch the excitation codebooks based on an
evaluation result. In embodiment 7, in addition to the noise
state of the speech, the speech is classified in more detail,
e.g., voiced onset, plosive consonant, etc., and a suitable
excitation codebook can be used for each state. Therefore, a
high quality speech can be reproduced.

Embodiment 8. 

[0072] In embodiments 1 - 6, the noise level in the coding
period is evaluated by using a spectrum gradient, short-term
prediction gain, and pitch fluctuation. However, it is also
possible to evaluate the noise level by using a ratio of a
gain value against an output from the adaptive codebook.
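
As a non-limiting illustration, one possible reading of such a ratio is the share of the excitation energy that the gain-weighted adaptive codebook output contributes. The specification does not define the ratio precisely; this energy-based definition and the function name are assumptions made for this sketch. Under this reading, a small ratio would suggest a weakly periodic, more noise-like period.

```python
from typing import Sequence


def adaptive_contribution_ratio(adaptive_vector: Sequence[float],
                                excitation_vector: Sequence[float],
                                gain_adaptive: float,
                                gain_excitation: float) -> float:
    """Energy of the gain-weighted adaptive vector over the total energy
    of both gain-weighted vectors; 0.0 when the total energy is zero."""
    energy_adaptive = sum((gain_adaptive * a) ** 2 for a in adaptive_vector)
    energy_excitation = sum((gain_excitation * e) ** 2
                            for e in excitation_vector)
    total = energy_adaptive + energy_excitation
    return energy_adaptive / total if total > 0.0 else 0.0


# Usage: a strongly periodic period is dominated by the adaptive codebook.
ratio = adaptive_contribution_ratio([1.0, 1.0], [1.0, 1.0], 3.0, 1.0)
```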

Industrial Applicability

[0073] In the speech coding method, speech decoding method,
speech coding apparatus, and speech decoding apparatus
according to this invention, a noise level of a speech in a
concerning coding period is evaluated by using a code or
coding result of at least one of the spectrum information,
power information, and pitch information, and various
excitation codebooks are used based on the evaluation result.
Therefore, a high quality speech can be reproduced with a
small data amount.

[0074] In the speech coding method and speech decoding
method according to this invention, a plurality of excitation
codebooks storing excitations with various noise levels is
provided, and the plurality of excitation codebooks is
switched based on the evaluation result of the noise level of
the speech. Therefore, a high quality speech can be reproduced
with a small data amount.

[0075] In the speech coding method and speech decoding
method according to this invention, the noise levels of the
time series vectors stored in the excitation codebooks are
changed based on the evaluation result of the noise level of
the speech. Therefore, a high quality speech can be reproduced
with a small data amount.

[0076] In the speech coding method and speech decoding
method according to this invention, an excitation codebook
storing noise time series vectors is provided, and a time
series vector with a low noise level is generated by sampling
signal samples in the time series vectors based on the
evaluation result of the noise level of the speech. Therefore,
a high quality speech can be reproduced with a small data
amount.

[0077] In the speech coding method and speech decoding
method according to this invention, the first excitation
codebook storing noise time series vectors and the second
excitation codebook storing non-noise time series vectors are
provided, and the time series vector in the first excitation
codebook or the time series vector in the second excitation
codebook is weighted based on the evaluation result of the
noise level of the speech, and added to generate a time series
vector. Therefore, a high quality speech can be reproduced
with a small data amount.

Claims

1. A speech coding method according to a code-excited linear
prediction (Code-Excited Linear Prediction: CELP) speech
coding method, comprising:

evaluating a noise level of a speech in a concerning coding
period by using a code or coding result of at least one of
spectrum information, power information, and pitch
information; and

selecting one of a plurality of excitation codebooks based on
an evaluation result.

2. The speech coding method of claim 1, further comprising:

the plurality of excitation codebooks storing time series
vectors with various noise levels; and

switching the plurality of excitation codebooks based on the
evaluation result of the noise level of the speech.

3. The speech coding method of claim 1, further comprising
changing a noise level of time series vectors stored in the
excitation codebooks based on the evaluation result of the
noise level of the speech.

4. The speech coding method of claim 3, further comprising:

an excitation codebook storing noise time series vectors; and

generating a low noise time series vector by sampling signal
samples in the time series vectors based on the evaluation
result of the noise level of the speech.

5. The speech coding method of claim 3, further comprising:

a first excitation codebook storing a noise time series vector
and a second excitation codebook storing a non-noise time
series vector; and

generating a time series vector by adding the time series
vector in the first excitation codebook and the time series
vector in the second excitation codebook by weighting based on
the evaluation result of the noise level of the speech.

6. A speech decoding method according to a code- 
excited linear prediction (CELP) speech decoding 
method, comprising: 

evaluating a noise level of a speech in a concerning decoding
period by using a code or decoding result of at least one of
spectrum information, power information, and pitch
information; and

selecting one of a plurality of excitation code- 
books based on an evaluation result. 

7. The speech decoding method of claim 6, further 
comprising: 

the plurality of excitation codebooks storing 
time series vectors with various noise levels; 
and 

switching the plurality of excitation codebooks 
based on the evaluation result of the noise level 
of the speech. 

8. The speech decoding method of claim 6, further comprising
changing a noise level of time series vectors stored in the
excitation codebooks based on the evaluation result of the
noise level of the speech.

9. The speech decoding method of claim 8, further 
comprising: 

an excitation codebook storing noise time 
series vectors; and 

generating a low noise time series vector by 
sampling signal samples in the time series vec- 
tors based on the evaluation result of the noise 
level of the speech. 

10. The speech decoding method of claim 8, further 
comprising: 

a first excitation codebook storing a noise time 
series vector and a second excitation code- 
book storing a non-noise time series vector; 
and 

generating a time series vector by adding the 
time series vector in the first excitation code- 
book and the time series vector in the second 
excitation codebook by weighting based on the 
evaluation result of the noise level of the 
speech. 

11. A speech coding apparatus, comprising: 

a spectrum information encoder for coding spectrum information
of an input speech, and outputting a coded spectrum
information as an element of a coding result;

a noise level evaluator for evaluating a noise level of a
speech in a concerning coding period by using a code or coding
result of at least one of spectrum information and power
information, obtained from the coded spectrum information
provided by the spectrum information encoder, and outputting
an evaluation result;

a first excitation codebook storing a plurality of non-noise
time series vectors;

a second excitation codebook storing a plurality of noise time
series vectors;

an excitation codebook switch for switching the first
excitation codebook and the second excitation codebook based
on the evaluation result by the noise level evaluator;

a weighting-adder for weighting the time series vectors from
the first excitation codebook and second excitation codebook
depending on respective gains of the time series vectors and
adding;

a synthesis filter for producing a coded speech based on an
excitation signal, which is a weighted time series vector, and
the coded spectrum information from the spectrum information
encoder; and

a distance calculator for calculating a distance between the
coded speech and the input speech, searching an excitation
code and gain for minimizing the distance, and outputting a
result as an excitation code and a gain code as a coding
result.

12. A speech decoding apparatus, comprising:

a spectrum information decoder for decoding a spectrum
information code to spectrum information;

a noise level evaluator for evaluating a noise level of a
speech in a concerning decoding period by using a decoding
result or the spectrum information code of at least one of
spectrum information and power information, obtained from
decoded spectrum information provided by the spectrum
information decoder, and outputting an evaluation result;

a first excitation codebook storing a plurality of non-noise
time series vectors;

a second excitation codebook storing a plural- 
ity of noise time series vectors; 

an excitation codebook switch for switching the 
first excitation codebook and the second exci- 
tation codebook based on the evaluation result 
of the noise level evaluator; 

a weighting-adder for weighting the time series 
vectors from the first excitation codebook and 
second excitation codebook depending on 
respective gains of the time series vectors and 
adding; and 

a synthesis filter for producing a decoded 
speech based on an excitation signal, which is 
a weighted time series vector, and the decoded 
spectrum information from the spectrum infor- 
mation decoder. 

13. A speech coding apparatus according to a code- 
excited linear prediction (CELP) speech coding 
apparatus comprising: 

a noise level evaluator for evaluating a noise 
level of a speech in a concerning coding period 
by using a code or coding result of at least one 
of spectrum information, power information, 
and pitch information; and 

an excitation codebook switch for switching a 
plurality of excitation codebooks based on an 
evaluation result of the noise level evaluator. 

14. A speech decoding apparatus according to a code-excited
linear prediction (CELP) speech decoding apparatus comprising:

a noise level evaluator for evaluating a noise 
level of a speech in a concerning decoding 
period by using a code or decoding result of at 
least one of spectrum information, power infor- 
mation, and pitch information; and 

an excitation codebook switch for switching a 
plurality of excitation codebooks based on an 
evaluation result. 